Publications

Publications related to the VIS Team

Noise simulation for the improvement of training deep neural network for printer-proof steganography

In the modern era, images have emerged as powerful tools for concealing information, giving rise to innovative methods like watermarking and steganography, with end-to-end steganography solutions emerging in recent years. However, these methods still suffer from issues with recovery of the hidden message and with decreased image quality. This paper investigates the efficacy of noise simulation and deep learning methods for improving the resistance of steganography to printing. The research develops an end-to-end printer-proof steganography solution, with a particular focus on a noise simulation module capable of overcoming the distortions introduced by transmission through the print-scan medium. Several approaches are employed, from combining several sources of noise present in the physical environment during printing and capture by image sensors, to the introduction of data augmentation techniques and self-supervised learning to improve and stabilize the robustness of the network. Through rigorous experimentation, a significant increase in the robustness of the network was obtained by adding noise combinations while maintaining the network's performance. These experiments conclusively demonstrate that noise simulation provides a robust and efficient method for improving printer-proof steganography.
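
The paper does not specify its noise module, but the idea of composing several physical distortion sources into one print-scan simulation can be sketched as follows. This is an illustrative toy, not the authors' implementation; the box blur, contrast jitter, and all parameter values are hypothetical stand-ins for the printer and sensor distortions the abstract mentions.

```python
import numpy as np

def add_gaussian_noise(img, sigma=0.02, rng=None):
    """Sensor noise: additive zero-mean Gaussian (sigma is a guess)."""
    rng = rng or np.random.default_rng(0)
    return img + rng.normal(0.0, sigma, img.shape)

def box_blur(img, k=3):
    """Optical blur from printing/scanning, approximated by a box filter."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def contrast_jitter(img, gain=0.9, bias=0.05):
    """Tone shift from the printer's ink response (linear model assumed)."""
    return gain * img + bias

def print_scan_sim(img, rng=None):
    """Compose the distortion sources and clip back to valid intensities."""
    out = box_blur(img)
    out = contrast_jitter(out)
    out = add_gaussian_noise(out, rng=rng)
    return np.clip(out, 0.0, 1.0)
```

In an end-to-end setup, a module like this would sit between the encoder and decoder networks during training, so the decoder learns to read messages that survive the simulated print-scan channel.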

  • Author(s): Telmo Cunha, Luiz Schirmer, João Marcos and Nuno Gonçalves
  • Featured In: 13th International Conference on Pattern Recognition Applications and Methods (ICPRAM). Rome, Italy
  • Publication Type: Conference Papers
  • DOI: 10.5220/0012272300003654
  • Year: 2024

Fused Classification for Differential Face Morphing Detection

Face morphing, a sophisticated presentation attack technique, poses significant security risks to face recognition systems. Traditional methods struggle to detect morphing attacks, which involve blending multiple face images to create a synthetic image that can match different individuals. In this paper, we focus on the differential detection of face morphing and propose an extended approach based on a fused classification method for the no-reference scenario. We introduce a public face morphing detection benchmark for the differential scenario and utilize a specific data mining technique to enhance the performance of our approach. Experimental results demonstrate the effectiveness of our method in detecting morphing attacks.

  • Author(s): Iurii Medvedev, Joana Alves Pimenta and Nuno Gonçalves
  • Featured In: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)
  • Publication Type: Conference Papers
  • DOI: 10.48550/arXiv.2309.00665
  • Year: 2024

Automatic Validation of ICAO Compliance Regarding Head Coverings: (…) Religious Circumstances

This paper contributes a dataset and an algorithm that automatically verifies compliance with the ICAO requirements related to the use of head coverings in facial images used in machine-readable travel documents. All the methods found in the literature ignore that some coverings might be accepted for religious or cultural reasons, and basically only look for the presence of hats/caps. Our approach specifically includes the religious cases and distinguishes the head coverings that might be considered compliant. We built a dataset composed of facial images of 500 identities to accommodate this type of accessory. That data was used to fine-tune and train a classification model based on the YOLOv8 framework, and we achieved state-of-the-art results with an accuracy of 99.1% and an EER of 5.7%.

  • Author(s): Carla Guerra, João Marcos, Nuno Gonçalves
  • Featured In: 2023 International Conference of the Biometrics Special Interest Group (BIOSIG)
  • Publication Type: Conference Papers
  • DOI: 10.1109/BIOSIG58226.2023.10345995
  • Year: 2023

Impact of Image Context for Single Deep Learning Face Morphing Attack Detection

The increase in security concerns due to technological advancements has led to the popularity of biometric approaches that utilize physiological or behavioral characteristics for enhanced recognition. Face recognition systems (FRSs) have become prevalent, but they are still vulnerable to image manipulation techniques such as face morphing attacks. This study investigates the impact of the alignment settings of input images on deep learning face morphing detection performance. We analyze the interconnections between the face contour and image context and suggest optimal alignment conditions for face morphing detection.

  • Author(s): Joana Alves Pimenta, Iurii Medvedev and Nuno Gonçalves
  • Featured In: 2023 International Conference of the Biometrics Special Interest Group (BIOSIG)
  • Publication Type: Conference Papers
  • DOI: 10.1109/BIOSIG58226.2023.10345999
  • Year: 2023

MorDeephy: Face Morphing Detection via Fused Classification

Face morphing attack detection (MAD) is one of the most challenging tasks in the field of face recognition nowadays. In this work, we introduce a novel deep learning strategy for single-image face morphing detection, which combines the discrimination of morphed face images with a sophisticated face recognition task in a complex classification scheme. It is directed at learning deep facial features that carry information about their authenticity. Our work also makes several additional contributions: a public and easy-to-use face morphing detection benchmark and the results of our wild-dataset filtering strategy. Our method, which we call MorDeephy, achieved state-of-the-art performance and demonstrated a prominent ability to generalize the task of morphing detection to unseen scenarios.

  • Author(s): Iurii Medvedev, Farhad Shadmand and Nuno Gonçalves
  • Featured In: 12th International Conference on Pattern Recognition Applications and Methods (ICPRAM), Lisbon, Portugal
  • Publication Type: Conference Papers
  • Year: 2023

Dealing with Overfitting in the Context of Liveness Detection Using FeatherNets with RGB Images

With the increased use of machine learning for liveness detection comes shortcomings like overfitting, where the model adapts perfectly to the training set but becomes unusable on the testing set, defeating the purpose of machine learning. This paper proposes how to approach overfitting without altering the model used, by focusing on the model's input and output information. The input approach focuses on the information obtained from the different modalities present in the datasets used, as well as on how varied that information is, not only in the number of spoof types but also in the ambient conditions under which the videos were captured. The output approaches focus both on the loss function, which affects the actual "learning" and is calculated from the model's output and then propagated backwards, and on the interpretation of that output to define which predictions are considered bona fide or spoof. Throughout this work, we were able to reduce the overfitting effect, with the difference between the best epoch and the average of the last fifty epochs dropping from 36.57% to 3.63%.

  • Author(s): Miguel Leão and Nuno Gonçalves
  • Featured In: 12th International Conference on Pattern Recognition Applications and Methods (ICPRAM). Lisbon, Portugal
  • Publication Type: Conference Papers
  • DOI: 10.5220/0011639600003411
  • Year: 2023

Improving Performance of Facial Biometrics With Quality-Driven Dataset Filtering

Advancements in deep learning techniques and the availability of large-scale face datasets have led to significant performance gains in face recognition in recent years. Modern face recognition algorithms are trained on large-scale in-the-wild face datasets. At the same time, many facial biometric applications rely on controlled image acquisition and enrollment procedures (for instance, document security applications). That is why such face recognition approaches can demonstrate deficient performance in the target scenario (ICAO-compliant images). However, modern approaches for face image quality estimation may help to mitigate that problem. In this work, we introduce a strategy for filtering training datasets by quality metrics and demonstrate that it can lead to performance improvements in biometric applications that rely on the face image modality. We filter the main academic datasets using the proposed filtering strategy and present performance metrics.

  • Author(s): Iurii Medvedev and Nuno Gonçalves
  • Featured In: Workshop on Interdisciplinary Applications of Biometrics and Identity Science (INTERID'2023), Hawaii
  • Publication Type: Conference Papers
  • DOI: 10.1109/FG57933.2023.10042579
  • Year: 2023

QualFace: Adapting Deep Learning Face Recognition for ID and Travel Doc with Quality Assessment

Modern face recognition biometrics widely rely on deep neural networks that are usually trained on large collections of wild face images of celebrities. This choice of data is related to its public availability, in a situation where existing ID-document-compliant face image datasets (usually stored by national institutions) are hardly accessible due to continuously increasing privacy restrictions. However, this may lead to a drop in performance in systems developed specifically for ID-document-compliant images. In this work, we propose a novel face recognition approach for mitigating that problem. To adapt a deep face recognition network for document security purposes, we propose to regularise the training process with a specific sample mining strategy which penalises the samples by their estimated quality, where the quality metric is proposed by our work and is tailored to the specific case of face images for ID documents. We perform extensive experiments and demonstrate the efficiency of the proposed approach for ID-document-compliant face images.

  • Author(s): João Tremoço, Iurii Medvedev, Nuno Gonçalves
  • Featured In: BIOSIG 2021
  • Publication Type: Conference Papers
  • Year: 2021

Face depth prediction by the scene depth

A depth map, also known as a range image, directly reflects the geometric shape of objects. Due to several issues such as cost, privacy and accessibility, face depth information is not easy to obtain. However, the spatial information of faces is very important in many areas of computer vision, especially in biometric identification. In contrast, scene depth information has become comparatively easy to obtain with the development of autonomous driving technology in recent years. This inspires the idea of bridging the gap between scene depth and face depth. Previously, face depth estimation and scene depth estimation were treated as two completely separate domains. This paper proposes and explores utilizing learned scene depth knowledge to estimate depth maps of faces from monocular 2D images. Through experiments, we have preliminarily verified the possibility of using scene depth knowledge to predict the depth of faces and its potential for face feature representation.

  • Author(s): Bo Jin, Leandro Cruz, Nuno Gonçalves
  • Featured In: IEEE/ACIS 20th International Conference on Computer and Information Science, Shanghai, China
  • Publication Type: Conference Papers
  • DOI: 10.1109/ICIS51600.2021.9516598
  • Year: 2021

Cost Volume Refinement for Depth Prediction

Light-field cameras are becoming more popular in the consumer market. Their data redundancy allows, in theory, to accurately refocus images after acquisition and to predict the depth of each point visible from the camera. Combined, these two features allow for the generation of full-focus images, which is impossible in traditional cameras. Multiple methods for depth prediction from light fields (or stereo) have been proposed over the years. A large subset of these methods relies on cost volumes: 3D structures in which each layer represents a heuristic of whether each point in the image is at a certain distance from the camera. Generally, this volume is used to regress a depth map, which is then refined for better results. In this paper, we argue that refining the cost volumes is superior to refining the depth maps in order to further increase the accuracy of depth predictions. We propose a set of cost-volume refinement algorithms and show their effectiveness.
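
The cost-volume idea the abstract relies on can be made concrete with a toy rectified-stereo example. This is not one of the paper's refinement algorithms: the absolute-difference cost and the per-layer box filter below are minimal stand-ins, chosen only to show refinement operating on the volume (before the argmin) rather than on the depth map.

```python
import numpy as np

def cost_volume(left, right, max_disp):
    """Layer d holds a per-pixel matching cost at disparity d.
    Intensities are assumed in [0, 1]; unmatched positions keep cost 1."""
    h, w = left.shape
    vol = np.ones((max_disp, h, w))
    for d in range(max_disp):
        vol[d, :, d:] = np.abs(left[:, d:] - right[:, :w - d])
    return vol

def refine_volume(vol, k=3):
    """Refine the volume itself: box-filter each cost slice spatially."""
    pad = k // 2
    out = np.empty_like(vol)
    for d in range(vol.shape[0]):
        sl = np.pad(vol[d], pad, mode="edge")
        acc = np.zeros_like(vol[d])
        for dy in range(k):
            for dx in range(k):
                acc += sl[dy:dy + vol.shape[1], dx:dx + vol.shape[2]]
        out[d] = acc / (k * k)
    return out

def disparity(vol):
    """Winner-takes-all depth proxy: argmin over the cost axis."""
    return np.argmin(vol, axis=0)
```

For a pair where the right view is the left view shifted by 2 pixels, `disparity(refine_volume(cost_volume(left, right, 4)))` recovers 2 in the validly matched region.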

  • Author(s): João L. Cardoso, Nuno Gonçalves and Michael Wimmer
  • Featured In: 25th International Conference on Pattern Recognition (ICPR), Milan, Italy
  • Publication Type: Conference Papers
  • DOI: 10.1109/ICPR48806.2021.9412730
  • Year: 2021

Biometric System for Mobile Validation of ID And Travel Documents

Current trends in the security of ID and travel documents require portable and efficient validation applications that rely on biometric recognition. Such tools can allow any authority and citizen to validate documents and authenticate citizens without the need for expensive and sometimes unavailable proprietary devices. In this work, we present a novel, compact and efficient approach to validating ID and travel documents for offline mobile applications. The approach employs an in-house biometric template that is extracted from the original portrait photo (either full frontal or token frontal) and then stored on the ID document using a machine readable code (MRC). The ID document can then be validated with the developed application on a mobile device with a digital camera. The similarity score is estimated using an artificial neural network (ANN). Results show that we achieve a validation accuracy of up to 99.5%, with a corresponding false match rate of 0.0047 and false non-match rate of 0.00034. (CITATION: I. Medvedev, N. Gonçalves and L. Cruz, "Biometric System for Mobile Validation of ID And Travel Documents," 2020 International Conference of the Biometrics Special Interest Group (BIOSIG), 2020, pp. 1-5.)

  • Author(s): Iurii Medvedev, Nuno Gonçalves, Leandro Cruz
  • Featured In: 2020 International Conference of the Biometrics Special Interest Group (BIOSIG), Darmstadt, Germany, pp. 1-5
  • Publication Type: Conference Papers
  • Year: 2020

Multimodal Deep-Learning for Object Recognition Combining Camera and LIDAR Data

Object detection and recognition is a key component of autonomous robotic vehicles, as evidenced by the continuous efforts made by the robotics community in areas related to object detection and sensory perception systems. This paper presents a study on multisensor (camera and LIDAR) late fusion strategies for object recognition. In this work, LIDAR data is processed as 3D points and also by means of a 2D representation in the form of a depth map (DM), which is obtained by projecting the LIDAR 3D point cloud onto a 2D image plane, followed by an upsampling strategy which generates a high-resolution 2D range view. A CNN (Inception V3) is used as the classification method on the RGB images and on the DMs (LIDAR modality). A 3D network (PointNet), which directly performs classification on the 3D point clouds, is also considered in the experiments. One of the motivations of this work is to incorporate the distance to objects, as measured by the LIDAR, as a relevant cue to improve classification performance. A new range-based average weighting strategy is proposed, which considers the relationship between the deep models' performance and the distance of objects. A classification dataset, based on the KITTI database, is used to evaluate the deep models and to support the experimental part. We report extensive results in terms of single modality, i.e., using the RGB and LIDAR models individually, and late fusion multimodality approaches. (CITATION: Gledson Melotti, Cristiano Premebida, Nuno Gonçalves, "Multimodal Deep-Learning for Object Recognition Combining Camera and LIDAR Data," 2020 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), Ponta Delgada, Portugal, pp. 177-182)
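
The general shape of a range-weighted late fusion can be sketched in a few lines. The linear weight schedule, the choice of which modality dominates at which range, and the 50 m cutoff below are all hypothetical; the paper's actual weighting is derived from the measured relationship between model performance and object distance.

```python
import numpy as np

def fuse(p_rgb, p_lidar, dist, d_max=50.0):
    """Late fusion of two per-class probability vectors with a
    distance-dependent weight. Here the LiDAR model's weight decays
    linearly with range (an assumption for illustration only)."""
    w = np.clip(1.0 - dist / d_max, 0.0, 1.0)
    p = w * p_lidar + (1.0 - w) * p_rgb
    return p / p.sum()  # renormalise to a valid distribution
```

At `dist=0` the fused prediction equals the LiDAR model's output, and beyond `d_max` it equals the RGB model's; intermediate ranges blend the two.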

  • Author(s): Gledson Melotti, Cristiano Premebida, Nuno Gonçalves
  • Featured In: 2020 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), Ponta Delgada, Portugal, pp. 177-182
  • Publication Type: Conference Papers
  • DOI: 10.1109/ICARSC49921.2020.9096138
  • Year: 2020

Probabilistic Object Classification using CNN ML-MAP layers

Deep networks are currently the state of the art for sensory perception in autonomous driving and robotics. However, deep models often generate overconfident predictions, precluding proper probabilistic interpretation, which we argue is due to the nature of the SoftMax layer. To reduce the overconfidence without compromising classification performance, we introduce a CNN probabilistic approach based on distributions calculated in the network's Logit layer. The approach enables Bayesian inference by means of ML and MAP layers. Experiments with calibrated and the proposed prediction layers are carried out on object classification using data from the KITTI database. Results are reported for camera (RGB) and LiDAR (range-view) modalities, where the new approach shows promising performance compared to SoftMax.
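
The SoftMax saturation the abstract argues against is easy to reproduce numerically. The sketch below contrasts it with a Bayes-rule posterior computed from class-conditional distributions over logit values, which is the spirit (not the implementation) of the paper's ML/MAP layers; the Gaussian parameters are hypothetical stand-ins for statistics that would be fitted on training logits.

```python
import numpy as np

def softmax(z):
    """Standard SoftMax; a single large logit drives it toward 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

def map_posterior(score, means, stds, priors):
    """Bayes rule over per-class Gaussians fitted on logit scores.
    means/stds/priors here are invented for illustration."""
    lik = np.exp(-0.5 * ((score - means) / stds) ** 2) / stds
    post = lik * priors
    return post / post.sum()

logits = np.array([12.0, 2.0, 1.0])          # one very large logit
p_soft = softmax(logits)                     # saturates near certainty
p_map = map_posterior(logits[0],
                      means=np.array([8.0, 3.0, 2.0]),
                      stds=np.array([5.0, 4.0, 4.0]),
                      priors=np.array([1/3, 1/3, 1/3]))
```

With these (made-up) score distributions, SoftMax assigns the winning class a probability above 0.999, while the distribution-based posterior stays noticeably below 1 for the same input.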

  • Author(s): Gledson Melotti, Cristiano Premebida, Jordan Bird, Diego Faria, and Nuno Gonçalves
  • Featured In: ECCV Workshop on Perception for Autonomous Driving (PAD)
  • Publication Type: Conference Papers
  • Year: 2020

Object detection in traffic scenarios - a comparison of traditional and deep learning approaches

In the area of computer vision, research on object detection algorithms has grown rapidly, as it is the fundamental step for automation, specifically for self-driving vehicles. This work presents a comparison of traditional and deep learning approaches for the task of object detection in traffic scenarios. A handcrafted feature descriptor, the Histogram of Oriented Gradients (HOG) with a linear Support Vector Machine (SVM) classifier, is compared with deep learning approaches like the Single Shot Detector (SSD) and You Only Look Once (YOLO), in terms of mean Average Precision (mAP) and processing speed. The SSD algorithm is implemented with different backbone architectures like VGG16, MobileNetV2 and ResNeXt50, and similarly the YOLO algorithm with MobileNetV1 and ResNet50, to compare the performance of the approaches. Training and inference are performed on the PASCAL VOC 2007 and 2012 training data and the PASCAL VOC 2007 test data, respectively. We consider five classes relevant for traffic scenarios, namely bicycle, bus, car, motorbike and person, for the calculation of mAP. Both qualitative and quantitative results are presented for comparison. For the task of object detection, the deep learning approaches outperform the traditional approach in both accuracy and speed. This is achieved at the cost of requiring large amounts of data, high computational power, and time to train a deep learning approach.
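
The handcrafted side of the comparison rests on the HOG descriptor. A stripped-down version of the idea (per-cell histograms of gradient orientations, weighted by gradient magnitude) can be sketched in numpy; real HOG adds block normalisation over groups of cells and feeds the flattened descriptor to a linear SVM, neither of which is shown here.

```python
import numpy as np

def hog_cells(img, cell=8, bins=9):
    """Toy HOG: one orientation histogram per cell x cell block,
    L2-normalised per cell (real HOG normalises over blocks)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180      # unsigned orientation
    bin_idx = np.minimum((ang / (180 / bins)).astype(int), bins - 1)
    ch, cw = img.shape[0] // cell, img.shape[1] // cell
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):
        for j in range(cw):
            sl = np.s_[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            hist[i, j] = np.bincount(bin_idx[sl].ravel(),
                                     weights=mag[sl].ravel(),
                                     minlength=bins)
    norm = np.linalg.norm(hist, axis=2, keepdims=True)
    return hist / np.maximum(norm, 1e-12)

# a horizontal intensity ramp: all gradient energy in one orientation bin
desc = hog_cells(np.tile(np.arange(32, dtype=float), (32, 1)))
```

On the ramp image, every cell's histogram collapses into the first orientation bin, which is what makes the descriptor discriminative for edge-dominated shapes such as pedestrians.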

  • Author(s): Gopi K. Erabati, Nuno Gonçalves, and Helder Araújo
  • Featured In: Computer Science & Information Technology, AIRCC Publishing Corporation
  • Publication Type: Conference Papers
  • DOI: 10.5121/csit.2020.100918
  • Year: 2020

Deep-Learning based Global and Semantic Feature Fusion for Indoor Scene Classification

This paper focuses on the task of RGB indoor scene classification. A single scene may contain various configurations and points of view, but there is a small number of objects that can characterize the scene. In this paper, we propose a deep-learning-based Global and Semantic Feature Fusion Approach (GSF²App) with two branches. In the first branch (top branch), a CNN model is trained to extract global features from RGB images, leveraging an ImageNet pre-trained model to initialize our CNN's weights. In the second branch (bottom branch), we develop a semantic feature vector that represents the objects in the image, which are detected and classified with the COCO-pretrained YOLOv3 model. Then, both global and semantic features are combined in an intermediate feature fusion stage. The proposed approach was evaluated on the SUN RGB-D Dataset and NYU Depth Dataset V2, achieving state-of-the-art results on both datasets.

  • Author(s): Ricardo Pereira, Nuno Gonçalves, Luís Garrote, Tiago Barros, Ana Lopes, Urbano J. Nunes
  • Featured In: 2020 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), Ponta Delgada, Portugal
  • Publication Type: Conference Papers
  • DOI: 10.1109/ICARSC49921.2020.9096068
  • Year: 2020

UniqueMark - A method to create and authenticate a unique mark in precious metal artefacts

The UniqueMark project aims at providing a precious metal artefact with a unique, unclonable and irreproducible mark, and at building a system for validating its authenticity. The system verifies the mark's authenticity using a microscope, or a smartphone camera with an attached macro lens.

  • Author(s): Nuno Gonçalves, Leandro Cruz
  • Featured In: Jewellery Materials Congress 2019
  • Publication Type: Conference Papers
  • Year: 2019

A Content-aware Filtering for RGBD Faces

We present a content-aware filtering for 2.5D meshes of faces that preserves their intrinsic features. We take advantage of prior knowledge of the models (faces) to improve the comparison. The model is invariant to depth translation and scale. The proposed method is evaluated on a public 3D face dataset with different levels of noise. The results show that the method is able to remove noise without smoothing the sharp features of the face.

  • Author(s): Leandro Dihl, Leandro Cruz, Nuno Monteiro, Nuno Gonçalves
  • Featured In: GRAPP 2019 - International Conference on Computer Graphics Theory and Applications
  • Publication Type: Conference Papers
  • Year: 2019

Large Scale Information Marker Coding for Augmented Reality Using Graphic Code

The main advantage of using this approach as an Augmented Reality marker is the possibility of creating generic applications that can read and decode these Graphic Code markers, which might contain 3D models and complex scenes encoded in them. Additionally, the resulting marker has strong aesthetic characteristics, since it is generated from any chosen base image.

  • Author(s): Leandro Cruz, Bruno Patrão, Nuno Gonçalves
  • Featured In: IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)
  • Publication Type: Conference Papers
  • Year: 2018

Graphic Code: a New Machine Readable Approach

Graphic Code has two major advantages over classical MRCs: aesthetics and a larger coding capacity. It opens new possibilities for several purposes such as identification, tracking (using a specific border), and the transfer of content to the application. This paper focuses on presenting how Graphic Code can be used for industry applications, emphasizing its uses in Augmented Reality (AR).

  • Author(s): Leandro Cruz, Bruno Patrão, Nuno Gonçalves
  • Featured In: IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)
  • Publication Type: Conference Papers
  • DOI: 10.1109/AIVR.2018.00036
  • Year: 2018

Uniquemark: A computer vision system for hallmarks authentication

UniqueMark is a vision system for authentication based on random marks, particularly hallmarks. Hallmarks are used worldwide to authenticate and attest the legal fineness of precious metal artefacts. Our authentication method is based on a multiclass classifier model that uses a mark descriptor composed of several geometric features of the particles.

  • Author(s): Ricardo Barata, Leandro Cruz, Bruno Patrão, Nuno Gonçalves
  • Featured In: RECPAD 2018 - 24th Portuguese Conference on Pattern Recognition
  • Publication Type: Conference Papers
  • Year: 2018

Halftone Pattern: A New Steganographic Approach

We present a steganographic technique to hide textual information in an image. It is inspired by the use of dithering to create halftone images. Starting from a base image, it creates the coded image by associating each base-image pixel with a set of two-color pixels (halftone) forming an appropriate pattern. The coded image is machine-readable, has good aesthetics, is secure, and supports data redundancy and compression.
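
The dithering idea the technique builds on can be shown with a classic ordered (Bayer) halftone: each grayscale pixel is compared against a tiled threshold matrix to produce a binary pattern whose local density matches the original tone. This sketch shows only the halftoning step; the message-embedding pattern selection of the actual method is not reproduced here.

```python
import numpy as np

# 4x4 Bayer threshold matrix, normalised to [0, 1)
BAYER4 = np.array([[ 0,  8,  2, 10],
                   [12,  4, 14,  6],
                   [ 3, 11,  1,  9],
                   [15,  7, 13,  5]]) / 16.0

def halftone(img):
    """Ordered dithering: img in [0, 1] -> binary image, same shape."""
    h, w = img.shape
    thresh = np.tile(BAYER4, (h // 4 + 1, w // 4 + 1))[:h, :w]
    return (img > thresh).astype(np.uint8)
```

A mid-gray input turns half of the pixels on, and darker inputs turn on proportionally fewer, which is what lets the binary pattern carry the base image's appearance.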

  • Author(s): Bruno Patrão, Leandro Cruz, Nuno Gonçalves
  • Featured In: Eurographics 2018
  • Publication Type: Conference Papers
  • Year: 2018

An Application of a Halftone Pattern Coding in Augmented Reality

We present a coding system using a halftone pattern (with black and white pixels) that can be integrated into markers encoding information that can be retrieved a posteriori and used to create augmented reality applications. These markers can be easily detected in a photo, and the encoded information is the basis for parameterizing various types of augmented reality applications.

  • Author(s): Bruno Patrão, Leandro Cruz, Nuno Gonçalves
  • Featured In: SIGGRAPH Asia 2017
  • Publication Type: Conference Papers
  • DOI: 10.1145/3145690.3145705
  • Year: 2017